Translation and Dictionary
Jaccard similarity : English Wikipedia
Jaccard index

The Jaccard index, also known as the Jaccard similarity coefficient (originally coined ''coefficient de communauté'' by Paul Jaccard), is a statistic used for comparing the similarity and diversity of sample sets. The Jaccard coefficient measures similarity between finite sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets:
: J(A,B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}.
(If ''A'' and ''B'' are both empty, we define ''J''(''A'',''B'') = 1.)
: 0\le J(A,B)\le 1.
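The definition above translates almost directly into code. The following is a minimal sketch in Python, assuming the inputs are finite collections of hashable items; the function name `jaccard` is chosen here for illustration and does not come from the source.

```python
def jaccard(a, b):
    """Jaccard similarity of two finite collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        # Convention stated in the text: J(A, B) = 1 when both sets are empty.
        return 1.0
    # Size of the intersection divided by the size of the union.
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # {1, 2, 3} and {2, 3, 4} share 2 elements out of 4 distinct ones.
    print(jaccard({1, 2, 3}, {2, 3, 4}))  # 0.5
```

Because the intersection is never larger than the union, the result always lies between 0 and 1, matching the bound given above.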
The MinHash (min-wise independent permutations) locality-sensitive hashing scheme may be used to efficiently estimate the Jaccard similarity coefficient of pairs of sets, where each set is represented by a constant-sized signature derived from the minimum values of a hash function.
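One common way to build such a signature is to take the minimum hash value of the set under several differently seeded hash functions; the fraction of positions where two signatures agree then estimates the Jaccard coefficient. The sketch below illustrates that variant only. The helper names (`seeded_hash`, `minhash_signature`, `estimate_jaccard`), the choice of BLAKE2b as the hash, and the default of 128 hash functions are all assumptions made for this example, not details from the source.

```python
import hashlib


def seeded_hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed (illustrative choice)."""
    data = f"{seed}:{item!r}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """Constant-sized signature: the minimum hash value under each seeded hash."""
    items = set(items)
    if not items:
        raise ValueError("this sketch does not define a signature for the empty set")
    return [min(seeded_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which the two sets agree."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    a = set(range(0, 100))
    b = set(range(50, 150))
    # The exact Jaccard coefficient is 50 / 150; the estimate should be close to it.
    est = estimate_jaccard(minhash_signature(a, 256), minhash_signature(b, 256))
    print(f"estimate: {est:.3f}, exact: {50 / 150:.3f}")
```

The estimate is only approximate: its error shrinks roughly with the square root of the number of hash functions, so longer signatures trade memory for accuracy.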
The Jaccard distance, which measures ''dis''similarity between sample sets, is complementary to the Jaccard coefficient and is obtained by subtracting the Jaccard coefficient from 1, or, equivalently, by dividing the difference of the sizes of the union and the intersection of two sets by the size of the union:
: d_J(A,B) = 1 - J(A,B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}.
An alternative interpretation of the Jaccard distance is as the ratio of the size of the symmetric difference A \triangle B = (A \cup B) - (A \cap B) to the size of the union.
This distance is a metric on the collection of all finite sets.
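The symmetric-difference form of the distance, and the triangle inequality that the metric property implies, can be spot-checked by brute force on a small universe. The following is a sketch under stated assumptions: the function names are illustrative, exact rational arithmetic (`fractions.Fraction`) is used to avoid floating-point rounding in the comparison, and the distance between two empty sets is taken to be 0, following the convention J = 1 for that case.

```python
from fractions import Fraction
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance as |A symmetric-difference B| / |A union B|, as an exact fraction."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets empty: J = 1 by convention, so the distance is 0.
        return Fraction(0)
    return Fraction(len(a ^ b), len(union))


def all_subsets(universe):
    """Every subset of a small finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def triangle_inequality_holds(universe):
    """Check d(A, C) <= d(A, B) + d(B, C) for all triples of subsets."""
    subsets = all_subsets(universe)
    return all(
        jaccard_distance(a, c) <= jaccard_distance(a, b) + jaccard_distance(b, c)
        for a in subsets for b in subsets for c in subsets
    )


if __name__ == "__main__":
    print(jaccard_distance({1, 2, 3}, {2, 3, 4}))  # 1/2
    print(triangle_inequality_holds(range(4)))
```

An exhaustive check like this is only an illustration on one small universe, not a proof of the metric property.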
There is also a version of the Jaccard distance for measures, including probability measures. If \mu is a measure on a measurable space X, then we define the Jaccard coefficient by J_\mu(A,B) = \frac{\mu(A \cap B)}{\mu(A \cup B)}, and the Jaccard distance by d_\mu(A,B) = 1 - J_\mu(A,B) = \frac{\mu(A \triangle B)}{\mu(A \cup B)}. Care must be taken if \mu(A \cup B) = 0 or \infty, since these formulas are not well defined in that case.
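For a concrete case of the measure version, consider a finite space where the measure of a set is the sum of non-negative weights of its points. The sketch below assumes exactly that setting; the function name `measure_jaccard` and the `weight` mapping are hypothetical, introduced only for this illustration.

```python
import math


def measure_jaccard(a, b, weight):
    """J_mu(A, B) for a measure given by summing point weights (illustrative)."""
    a, b = set(a), set(b)
    mu_union = sum(weight[x] for x in a | b)
    if mu_union == 0 or math.isinf(mu_union):
        # The text warns that the formula is not well defined in these cases.
        raise ValueError("mu(A union B) is 0 or infinite; J_mu is not well defined")
    mu_intersection = sum(weight[x] for x in a & b)
    return mu_intersection / mu_union


if __name__ == "__main__":
    weight = {"x": 1.0, "y": 3.0, "z": 4.0}
    # mu(A intersect B) = 3 and mu(A union B) = 8.
    print(measure_jaccard({"x", "y"}, {"y", "z"}, weight))  # 0.375
```

With every weight equal to 1, this reduces to the counting version of the coefficient defined for finite sets.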
== Similarity of asymmetric binary attributes ==
Given two objects, ''A'' and ''B'', each with ''n'' binary attributes, the Jaccard coefficient is a useful measure of the overlap that ''A'' and ''B'' share in their attributes. Each attribute of ''A'' and ''B'' can be either 0 or 1. The total number of attributes for each combination of values in ''A'' and ''B'' is specified as follows:
:M_{11} represents the total number of attributes where ''A'' and ''B'' both have a value of 1.
:M_{01} represents the total number of attributes where the attribute of ''A'' is 0 and the attribute of ''B'' is 1.
:M_{10} represents the total number of attributes where the attribute of ''A'' is 1 and the attribute of ''B'' is 0.
:M_{00} represents the total number of attributes where ''A'' and ''B'' both have a value of 0.

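{|
|-
! colspan="2" rowspan="2" |
! colspan="2" | ''A''
|-
! 0
! 1
|-
! rowspan="2" | ''B''
! 0
| M_{00}
| M_{10}
|-
! 1
| style="background:#ffcccc;border:solid;border-right:none;"|M_{01}
| style="background:#66ff66;border:solid;border-top:none;border-left:none;"|M_{11}
|}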
Each attribute must fall into one of these four categories, meaning that
:M_{11} + M_{01} + M_{10} + M_{00} = n.
The Jaccard similarity coefficient, ''J'', is given as
:J = \frac{M_{11}}{M_{01} + M_{10} + M_{11}}.
The Jaccard distance, ''d''<sub>''J''</sub>, is given as
:d_J = \frac{M_{01} + M_{10}}{M_{01} + M_{10} + M_{11}} = 1 - J.
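The four counts and the two formulas for binary attributes can be computed in a single pass over the attribute vectors. The following is a minimal sketch, assuming both objects are given as equal-length sequences of 0s and 1s; the function names are illustrative, and returning J = 1 when no attribute is 1 in either object is an assumption carried over from the empty-set convention.

```python
def match_counts(a, b):
    """Return (M11, M01, M10, M00) for two equal-length 0/1 sequences."""
    if len(a) != len(b):
        raise ValueError("both objects must have the same number of attributes")
    m11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    m01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    m10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    m00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    return m11, m01, m10, m00


def binary_jaccard(a, b):
    """J = M11 / (M01 + M10 + M11); M00 is deliberately ignored."""
    m11, m01, m10, _ = match_counts(a, b)
    denominator = m01 + m10 + m11
    if denominator == 0:
        # Assumption: treat two all-zero objects like two empty sets (J = 1).
        return 1.0
    return m11 / denominator


if __name__ == "__main__":
    a = [1, 1, 0, 0, 1]
    b = [1, 0, 1, 0, 1]
    print(match_counts(a, b))        # (2, 1, 1, 1)
    print(binary_jaccard(a, b))      # 0.5
    print(1 - binary_jaccard(a, b))  # Jaccard distance: 0.5
```

Leaving M00 out of the denominator is what makes the measure suited to asymmetric attributes: attributes absent from both objects do not count as agreement.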

Excerpt source: Wikipedia, the free encyclopedia
Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.